All-CLL: A Capture-based Next-generation Sequencing Panel for the Molecular Characterization of Chronic Lymphocytic Leukemia

Chronic lymphocytic leukemia (CLL) is biologically and clinically very heterogeneous, with a compendium of immunogenetic and genomic factors dictating its course [1–4]. The mutational status of the immunoglobulin (IG) heavy chain variable region (IGHV) and mutations/deletions of TP53 are the 2 main molecular markers with prognostic and predictive value, and current guidelines recommend their characterization for a full prognostic evaluation and management of patients [2,3]. In most diagnostic laboratories, these 2 biomarkers are assessed separately using independent Sanger sequencing and/or next-generation sequencing (NGS) experiments [5–7], whereas copy number alterations (CNAs), including deletions (del) of 17p, 11q, and 13q and trisomy 12, are mostly analyzed by fluorescence in situ hybridization (FISH). In addition, the study of mutations in BTK, PLCG2, and BCL2 is recommended at the time of progression after treatment with BTK and/or BCL2 inhibitors. Although not used for clinical decision-making, mutations in genes such as SF3B1 and NOTCH1, and the recently described IGLV3-21 R110 mutation [8], might also have prognostic/predictive value.

subsequently annotated using IMGT/V-QUEST [19] and ARResT/AssignSubsets [20] following European Research Initiative on CLL (ERIC) guidelines [5] (Supplemental Digital Content).11]14 This cohort was selected to have a full representation of the main genomic and immunogenetic CLL drivers. We subsequently applied the all-CLL panel in a prospective cohort of 87 patients, in which the tumor cell content of the samples ranged from 19% to 99%. IGHV gene rearrangements and SHM status were analyzed in parallel by Sanger sequencing following ERIC guidelines [5], and FISH was performed to determine the status of 17p13.1/TP53, 11q22.3/ATM, 13q14.3, and trisomy 12 (Figure 1B; Supplemental Digital Content; Suppl. Table S3). In addition, 22 samples from the retrospective (n = 6) or prospective (n = 16) cohorts were analyzed in 2 independent NGS rounds to assess technical reproducibility. A summary of the performance metrics described below can be found in Suppl. Table S4.
Figure 1. Design of the all-CLL, validation process, and benchmark of gene mutations and CNAs. (A) Summary of the all-CLL workflow including the interrogated regions of interest, bioinformatic workflow, and output. Turnaround times of library preparation, sequencing, and bioinformatic analyses are summarized at the bottom. Note that multiple samples can be processed and analyzed in a single experiment (further details can be found in Supplemental Digital Content). (B) Schematic representation of the validation process. (C) Sensitivity of the all-CLL workflow to detect gene mutations according to previously published analyses by WGS/WES [9] and deep NGS [10,11]; n, number of mutations reported in the previous studies. The maroon crossbar represents the mean value. (D) Sensitivity, specificity, and concordance of CNAs between the all-CLL and gold-standard results in the retrospective (copy number arrays and/or FISH) [9–11] and prospective (FISH) cohorts. The maroon crossbar represents the mean concordance value; n, number of cases with CNA results by all-CLL and gold-standard techniques. Numbers in brackets show the number of samples carrying each alteration (del, deletion; tri, trisomy) or no alterations (wt) by FISH and/or copy number arrays. CNA = copy number alteration; CLL = chronic lymphocytic leukemia; FISH = fluorescence in situ hybridization; indels = short insertions/deletions; NGS = next-generation sequencing; SNV = single-nucleotide variants; WGS/WES = whole-genome/exome sequencing.

Regarding gene mutations in the retrospective cohort, we previously reported 12 mutations in 8 cases by whole-genome/exome sequencing [9] and 26 mutations in 12 cases by deep NGS (variant allele frequency [VAF] >2%) [11]. No mutations were previously reported in the remaining 5 cases. We identified 37 of 38 (97%) of these mutations using the all-CLL panel (Figure 1C; Suppl. Table S5). The missed mutation (SF3B1 p.K700E) was previously reported [11] at a VAF of 5.6% but was found at a VAF of 0.92% by the all-CLL panel; it was called but filtered out by the 2% VAF cutoff used during the analysis. The all-CLL panel also identified 19 mutations (7 single-nucleotide variants and 12 indels) in 10 samples that were not previously reported [11]. Most of these mutations (15/19; 79%) were detected at a VAF <6% and appeared to be true-positive calls after careful manual review (Suppl. Figure S1; Suppl. Table S6). Twenty-six mutations were called by the all-CLL assay in the 6 samples analyzed in 2 independent experiments. Reproducibility was 100% for mutations present at a VAF ≥3%, whereas 3 mutations that were detected in only 1 of the 2 NGS rounds had a VAF <3% (Suppl. Figure S2). In the prospective cohort, at least 1 mutation classified as pathogenic or likely pathogenic was identified in 50 of 87 (57.5%) CLL, including mutations in BTK, PLCG2, or BCL2 in samples collected at relapse after treatment with BTK or BCL2 inhibitors, respectively (Supplemental Digital Content; Suppl. Figure S3; Suppl. Table S7). The reanalysis of 16 samples from this prospective cohort in 2 NGS rounds showed 100% reproducibility for mutations present at a VAF ≥3% (Suppl. Figure S2; Suppl. Table S7).
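The effect of the VAF cutoff on the missed SF3B1 p.K700E call can be illustrated with a minimal sketch. The `Variant` class and `passes_vaf_cutoff` function are hypothetical names introduced here for illustration and are not part of the all-CLL bioinformatic workflow. Treating the cutoff as inclusive (≥2%) is also an assumption, since the text only states that a 2% cutoff was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical container for a called variant (illustration only)."""
    gene: str
    change: str
    vaf: float  # variant allele frequency as a fraction (0.0092 = 0.92%)


def passes_vaf_cutoff(variant: Variant, cutoff: float = 0.02) -> bool:
    """Return True if the variant is kept by the VAF filter.

    Assumption: the 2% cutoff is inclusive; the source does not specify this.
    """
    return variant.vaf >= cutoff


# VAF values taken from the text: 0.92% by the all-CLL panel,
# 5.6% in the previously published deep NGS analysis.
all_cll_call = Variant("SF3B1", "p.K700E", 0.0092)
previous_call = Variant("SF3B1", "p.K700E", 0.056)

print(passes_vaf_cutoff(all_cll_call))   # called, but removed by the filter
print(passes_vaf_cutoff(previous_call))  # would be retained at this VAF
```

Under these assumptions, the same variant is retained at the previously reported VAF and filtered out at the VAF measured by the panel, which is consistent with the single discordant call described above.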
CNAs were called by the all-CLL panel in 24 of 25 (96%) samples from the retrospective cohort (one sample failed the coverage-uniformity confidence threshold for CNA analyses). A 99% concordance was observed for the 4 CNAs studied between previously reported results by copy number arrays and/or FISH [9,11] and the all-CLL panel. The only discrepancy was a del(13q) found in 13% of the cells by FISH in 1 sample, which was missed by both the all-CLL panel and copy number arrays (Figure 1D; Suppl. Tables S8 and S9).
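As a worked check of the CNA concordance figure, the sketch below assumes that concordance was computed per sample-alteration pair (24 evaluable samples × 4 CNAs), which the text does not state explicitly. The function names are hypothetical. The generic sensitivity and specificity helpers are included only to make the definitions explicit; no per-alteration counts are given in this section, so none are used.

```python
def concordance(n_samples: int, n_alterations: int, n_discordant: int) -> float:
    """Fraction of sample-alteration pairs with matching calls.

    Assumption: each of the n_alterations CNAs in each sample counts as
    one comparison against the gold standard.
    """
    total = n_samples * n_alterations
    return (total - n_discordant) / total


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of gold-standard-positive cases detected by the assay."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of gold-standard-negative cases called negative by the assay."""
    return true_neg / (true_neg + false_pos)


# Retrospective cohort: 24 evaluable samples, 4 CNAs, 1 discordant del(13q).
retro = concordance(n_samples=24, n_alterations=4, n_discordant=1)
print(f"{retro:.1%}")  # prints 99.0% (95 of 96 comparisons)
```

Under this assumption, 95 of 96 comparisons agree, which matches the reported 99% concordance.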
We next assessed the value of the all-CLL workflow to characterize IG gene rearrangements. We detected a full-length, productive IGH V(D)J gene rearrangement in all 25 CLL from the retrospective cohort (Suppl. Table S11). The results were concordant with previously published results [14] regarding the rearranged IGH V(D)J genes in all tumors and in all but one in terms of CDR3 amino acid sequence (1 amino acid difference in the discordant sample). All tumors were equally classified according to the major stereotyped subsets (1 CLL#1, 4 CLL#2, 1 CLL#8, and 19 unassigned) and as mutated (n = 10) or unmutated (n = 15) IGHV using the 98% cutoff (Figure 2A). In addition, the same percentage of IGHV identity compared with the germ line sequence was determined in virtually all samples, including those with a borderline IGHV identity (97%-97.99%) (Figure 2B; Suppl. Figure S4). This retrospective cohort was intentionally biased toward IGLV3-21-expressing CLL to determine the robustness of the all-CLL panel to define IGLV3-21 R110 status. Among the 25 CLL studied, 13 tumors expressed the IGLV3-21 gene, either the IGLV3-21*04 (n = 11) or IGLV3-21*02 (n = 2) allele. Among the 11 CLL expressing IGLV3-21*04, 10 carried the R110 mutation [14]. A concordant result was obtained using the all-CLL approach in all samples (Figure 2C; Suppl. Table S12). In the prospective cohort, a full-length, productive IGH V(D)J gene rearrangement was detected by all-CLL in 83 of 87 (95.4%) samples, 8 belonging to a major stereotyped subset (2 of them CLL#2; Suppl. Table S13). Sanger sequencing performed in parallel in 62 of 87 randomly selected CLL identified a productive IGH V(D)J gene rearrangement in 57 of 62 (91.9%) samples. When comparing the results of the all-CLL and Sanger sequencing, we found 100% concordance for IGHV gene SHM status and stereotyped subset assignment, 97.9% concordance regarding the rearranged V(D)J genes identified, and 93.8% in terms of CDR3 amino acid sequence (Figure 2D). In addition, virtually the same IGHV identity compared with the germ line was identified by all-CLL and Sanger sequencing in the 50 full-length IGH rearrangements identified by both techniques, including CLL with borderline IGHV mutations (Figure 2E; Suppl. Figure S5). In this prospective cohort, 5 of 87 (5.8%) CLL carried the IGLV3-21 R110. In line with previous studies [13,14], these 5 CLL included the 2 cases classified as CLL#2, and all 5 carried a borderline IGHV gene SHM status (96.88%-98.26%; Suppl. Figure S3; Suppl. Tables S13 and S14). Finally, the reproducibility experiments in the 2 cohorts showed 100% concordance for IGHV gene SHM and IGLV3-21 R110 status and ≥94% concordance in stereotyped subset assignment (Suppl. Figure S6; Suppl. Tables S11-S14).
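The 98% cutoff used above to assign IGHV SHM status can be sketched as follows. The function `classify_ighv` is a hypothetical helper, not part of the all-CLL workflow. It assumes the common convention that germ line identity ≥98% is reported as unmutated and <98% as mutated, and it flags the 97%-97.99% band mentioned for the retrospective cohort as borderline.

```python
from typing import Tuple


def classify_ighv(identity_pct: float, cutoff: float = 98.0) -> Tuple[str, bool]:
    """Classify IGHV SHM status from percent identity to the germ line.

    Assumptions (labeled, not taken from the workflow's code):
    - identity >= cutoff -> "unmutated"; identity < cutoff -> "mutated"
    - "borderline" is the 97%-97.99% band quoted in the text
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity_pct must be between 0 and 100")
    status = "unmutated" if identity_pct >= cutoff else "mutated"
    borderline = 97.0 <= identity_pct < 98.0
    return status, borderline


# Illustrative identities only; these are not patient values from the study.
for identity in (100.0, 98.0, 97.5, 92.0):
    print(identity, classify_ighv(identity))
```

Because the classification turns on a single threshold, small differences in measured identity near 98% can change the assigned status, which is why agreement between the all-CLL and gold-standard identity values in borderline cases is reported separately.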
In summary, we have described here the development of the all-CLL, an integrative capture-based NGS solution able to determine in a single experiment the main immunogenetic and genomic markers of prognostic/predictive value in CLL, including full-length IGH V(D)J rearrangements, IGHV gene SHM status, IGLV3-21 R110, CNAs, and driver gene mutations (VAF ≥3%). The all-CLL panel, coupled with the designed bioinformatic workflow, facilitates the molecular characterization of CLL using a single DNA sample as starting material and a single experimental workflow, which reduces costs and turnaround time. The results of this study might thus encourage others to adopt NGS approaches for a more comprehensive (immuno)genomic characterization of CLL, which will contribute to a better understanding of the clinical value of these drivers under distinct treatment options and, ultimately, to advance toward a more personalized management of patients. Overall, integrative NGS approaches, such as the all-CLL panel presented here, might facilitate the routine molecular characterization of CLL while providing a complete evaluation of the immunogenetic and genomic drivers of the disease.

DISCLOSURES
EC has been a consultant for Takeda, NanoString, AbbVie, and Illumina; has received honoraria from Janssen, EUSPharma, and Roche for speaking at educational activities and research funding from AstraZeneca; and is an inventor on 2 patents filed by the National Institutes of Health, National Cancer Institute: "Methods for selecting and treating lymphoma types," licensed to NanoString Technologies, and "Evaluation of mantle cell lymphoma and methods related thereof," not related to this project. DC has received honoraria from AbbVie and AstraZeneca for speaking at educational activities. FN has received honoraria from Janssen, AbbVie, and SOPHiA GENETICS for speaking at educational activities. FN and EC have licensed the use of the protected IgCaller algorithm to Diagnóstica Longwood. All the other authors have no conflicts of interest to disclose.

Figure 2. Benchmark of full-length IGH gene rearrangement, SHM status, and IGLV3-21 R110. (A) Sensitivity of the all-CLL workflow to identify a productive IGH gene rearrangement in the retrospective cohort together with the concordance of the IGHV gene SHM status, identified V(D)J genes, and CDR3 amino acid sequence compared with gold-standard results [14]. (B) Dot plot of the percentage of identity of the rearranged IGHV sequence to the germ line by the all-CLL (y-axis) and gold-standard (x-axis) [14] in the retrospective cohort. (C) Concordance of the IGLV3-21 gene rearrangement and R110 status identified by the all-CLL compared with gold-standard data [14]. (D) Sensitivity of the all-CLL to identify a productive IGH gene rearrangement in the prospective cohort together with the concordance of the IGHV gene SHM status, identified V(D)J genes, and CDR3 amino acid sequence compared with Sanger sequencing (gold-standard). (E) Dot plot of the percentage of identity of the rearranged IGHV sequence to the germ line by the all-CLL (y-axis) and gold-standard (x-axis) in the prospective cohort. Full-length IGH gene rearrangements by both approaches were used (n = 50). CLL = chronic lymphocytic leukemia; IGHV = immunoglobulin heavy chain variable region; SHM = somatic hypermutation.